Troubleshooting: Diagnosing and Analyzing a Node Startup Failure in a 12cR2 Flex ASM Environment

张维照 (Zhang Weizhao), 数据和云, 2019-12-13

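Author | 张维照 (Zhang Weizhao), technical expert at 云和恩墨 and Oracle ACEA. He has worked in database administration since 2006 and moved to Oracle in 2009, maintaining and tuning multiple TB-scale provincial databases for business registration, healthcare, transportation, human resources and social security, and telecom operators. He specializes in analyzing and resolving Oracle database performance problems, Oracle database failure analysis, and Oracle database upgrades and migrations.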



Flex ASM

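Before 12c, a database instance connected to the ASM instance using operating system authentication, because the ASM client (the database instance) and the ASM server always ran on the same host. The Flex ASM architecture introduced in 12c allows database instances to run on different hosts from ASM. Flex ASM authenticates with a password file, which is stored in an ASM disk group, and an ASM user is created by default when Flex ASM is set up. Flex ASM also supports pre-12c RDBMS versions and is the recommended ASM architecture.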


ASM Network

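With Flex ASM, Oracle 12c introduced a new type of network called the ASM network. It carries the communication between ASM and the ASM clients across all nodes. Every ASM client in the cluster can access every ASM network, and a single network can also be configured to serve as both the private network and the ASM network.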


ASM Listeners

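ASM listeners support Flex ASM access. A set of ASM listeners is configured for each ASM network. Each ASM client database instance registers at most three ASM listener addresses as remote listeners, and all client connections are load-balanced across the whole set of ASM instances. The default listener name is ASMNET1LSNR_ASM.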


Case study

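With that background, let's analyze a Flex ASM case I ran into recently. The environment is a 12c R2 two-node RAC with Data Guard. While checking the standby side, I found that the instance on standby node 2 was not started and that redo was being received and applied on standby node 1. The problem showed up when I tried to start instance 2.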

root@anbobstb02:~$crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

root@anbobstb02:~$crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4534: Cannot communicate with Event Manager

grid@anbobstb02$ crsctl stat res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       anbobstb02                  STABLE
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       anbobstb02                  STABLE
ora.crf
      1        ONLINE  ONLINE       anbobstb02                  STABLE
ora.crsd
      1        ONLINE  OFFLINE                               STABLE
ora.cssd
      1        ONLINE  ONLINE       anbobstb02                  STABLE
ora.cssdmonitor
      1        ONLINE  ONLINE       anbobstb02                  STABLE
ora.ctssd
      1        ONLINE  ONLINE       anbobstb02                  OBSERVER,STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.drivers.acfs
      1        ONLINE  ONLINE       anbobstb02                  STABLE
ora.evmd
      1        ONLINE  INTERMEDIATE anbobstb02                  STABLE
ora.gipcd
      1        ONLINE  ONLINE       anbobstb02                  STABLE
ora.gpnpd
      1        ONLINE  ONLINE       anbobstb02                  STABLE
ora.mdnsd
      1        ONLINE  ONLINE       anbobstb02                  STABLE
ora.storage
      1        ONLINE  ONLINE       anbobstb02                  STABLE
--------------------------------------------------------------------------------

grid@anbobstb02:~$crsctl get cluster mode status
Cluster is running in "flex" mode


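Note:
This is a Flex ASM environment, and CRS failed to start on node 2: ora.crsd stays OFFLINE and ora.evmd is stuck in INTERMEDIATE.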
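
When the resource table is long, it can help to pick out mismatches mechanically. The following is a minimal sketch, not part of the original diagnosis, that scans pasted `crsctl stat res -t -init` output for rows whose state differs from the target. The function name `find_unhealthy` and the sample text are illustrative assumptions; the parsing assumes the two-line layout shown above (resource name on one line, instance row on the next).

```python
import re

# Matches an instance row such as "1  ONLINE  OFFLINE ... STABLE".
ROW = re.compile(r"^(\d+)\s+(ONLINE|OFFLINE)\s+(ONLINE|OFFLINE|INTERMEDIATE|UNKNOWN)\b")

def find_unhealthy(crsctl_output):
    """Return (resource, instance, target, state) for rows where state != target."""
    problems = []
    resource = None
    for line in crsctl_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("ora."):
            # A resource name line; the instance rows that follow belong to it.
            resource = stripped
            continue
        match = ROW.match(stripped)
        if match and resource:
            instance, target, state = match.groups()
            if state != target:
                problems.append((resource, int(instance), target, state))
    return problems

# Sample rows copied from the node 2 output above.
SAMPLE = """\
ora.crsd
      1        ONLINE  OFFLINE                               STABLE
ora.cssd
      1        ONLINE  ONLINE       anbobstb02                  STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.evmd
      1        ONLINE  INTERMEDIATE anbobstb02                  STABLE
"""

for row in find_unhealthy(SAMPLE):
    print(row)
```

On the sample rows this flags ora.crsd and ora.evmd, while ora.diskmon is skipped because its target is also OFFLINE.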

CRS alert log

2019-02-12 10:50:05.229 [OCSSD(51008)]CRS-1713: CSSD daemon is started in hub mode
2019-02-12 10:50:06.670000 +08:00
2019-02-12 10:50:06.670 [OCSSD(51008)]CRS-1707: Lease acquisition for node anbobstb02 number 2 completed
2019-02-12 10:50:07.756000 +08:00
2019-02-12 10:50:07.756 [OCSSD(51008)]CRS-1605: CSSD voting file is online: /dev/asm-disk55; details in /oracle/app/grid/diag/crs/anbobstb02/crs/trace/ocssd.trc.
2019-02-12 10:50:07.759 [OCSSD(51008)]CRS-1605: CSSD voting file is online: /dev/asm-disk52; details in /oracle/app/grid/diag/crs/anbobstb02/crs/trace/ocssd.trc.
2019-02-12 10:50:07.763 [OCSSD(51008)]CRS-1605: CSSD voting file is online: /dev/asm-disk51; details in /oracle/app/grid/diag/crs/anbobstb02/crs/trace/ocssd.trc.
2019-02-12 10:50:14.376000 +08:00
2019-02-12 10:50:14.376 [OCSSD(51008)]CRS-1601: CSSD Reconfiguration complete. Active nodes are anbobstb01 anbobstb02 .
2019-02-12 10:50:17.177000 +08:00
2019-02-12 10:50:17.176 [OCTSSD(55374)]CRS-8500: Oracle Clusterware OCTSSD process is starting with operating system process ID 55374
2019-02-12 10:50:17.193 [OCSSD(51008)]CRS-1720: Cluster Synchronization Services daemon (CSSD) is ready for operation.
2019-02-12 10:50:18.157 [OCTSSD(55374)]CRS-2403: The Cluster Time Synchronization Service on host anbobstb02 is in observer mode.
2019-02-12 10:50:19.266000 +08:00
2019-02-12 10:50:19.266 [OCTSSD(55374)]CRS-2407: The new Cluster Time Synchronization Service reference node is host anbobstb01.
2019-02-12 10:50:19.266 [OCTSSD(55374)]CRS-2401: The Cluster Time Synchronization Service started on host anbobstb02.
2019-02-12 10:50:35.725000 +08:00
2019-02-12 10:50:35.725 [ORAROOTAGENT(50588)]CRS-5019: All OCR locations are on ASM disk groups [OCRDG], and none of these disk groups are mounted. Details are at "(:CLSN00140:)" in "/oracle/app/grid/diag/crs/anbobstb02/crs/trace/ohasd_orarootagent_root.trc".


Trace ohasd_orarootagent_root.trc

adrci> show trace /oracle/app/grid/diag/crs/anbobstb02/crs/trace/ohasd_orarootagent_root.trc
2019-02-12 10:50:25.765 : AGFW:2530133760: {0:5:3} Agent sending reply for: RESOURCE_START[ora.cluster_interconnect.haip 1 1] ID 4098:403
2019-02-12 10:50:25.765 : USRTHRD:2519627520: {0:5:3} Check: 0-1
2019-02-12 10:50:25.766 : AGFW:2530133760: {0:5:3} ora.cluster_interconnect.haip 1 1 state changed from: STARTING to: ONLINE
2019-02-12 10:50:25.766 : AGFW:2530133760: {0:5:3} RECYCLE_AGENT attribute not found
2019-02-12 10:50:25.766 : AGFW:2530133760: {0:5:3} Started implicit monitor for [ora.cluster_interconnect.haip 1 1] interval=30000 delay=30000
2019-02-12 10:50:25.766 : AGFW:2530133760: {0:5:3} Agent sending last reply for: RESOURCE_START[ora.cluster_interconnect.haip 1 1] ID 4098:403
2019-02-12 10:50:25.768 : USRTHRD:2512111360: {0:5:3} got pubgrpdata, 1-8-2-2-2
2019-02-12 10:50:25.770 : USRTHRD:2512111360: {0:5:3} Completed 1 HAIP assignment, start complete
2019-02-12 10:50:25.770 : USRTHRD:2512111360: {0:5:3} to verify inf event
2019-02-12 10:50:25.813 : AGFW:2530133760: {0:5:3} Agent received the message: RESOURCE_START[ora.storage 1 1] ID 4098:438
2019-02-12 10:50:25.813 : AGFW:2530133760: {0:5:3} Preparing START command for: ora.storage 1 1
2019-02-12 10:50:25.813 : AGFW:2530133760: {0:5:3} ora.storage 1 1 state changed from: OFFLINE to: STARTING
2019-02-12 10:50:25.813 : AGFW:2530133760: {0:5:3} RECYCLE_AGENT attribute not found
2019-02-12 10:50:25.813 :CLSDYNAM:2519627520: [ora.storage]{0:5:3} [start] (:CLSN00107:) clsn_agent::start {
2019-02-12 10:50:25.814 :CLSDYNAM:2519627520: [ora.storage]{0:5:3} [start] StorageAgent::init NodeRole = 1
2019-02-12 10:50:25.814 :CLSDYNAM:2519627520: [ora.storage]{0:5:3} [start] StorageAgent::check NODEROLE_HUB getOCRdetails
2019-02-12 10:50:25.832 : default:2519627520: clsvactversion:4: Retrieving Active Version from local storage.
2019-02-12 10:50:25.840 :GIPCXCPT:2519627520: gipcInternalSetAttribute: failed during gipcInternalSetAttribute, ret gipcretInvalidAttribute (5)
2019-02-12 10:50:25.840 :GIPCXCPT:2519627520: gipcSetAttributeNativeF [clscrsconGipcConnect : clscrscon.c : 655]: EXCEPTION[ ret gipcretInvalidAttribute (5) ] failure for obj 0x7f8664436460 [0000000000000955] { gipcEndpoint : localAddr '', remoteAddr '', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj (nil), sendp (nil) status 13flags 0x20000000, flags-2 0x0, usrFlags 0x0 }, name 'traceLevel', val 0x7f86962cf004, len 4, flags 0x0
2019-02-12 10:50:25.855 : CLSNS:2519627520: clsns_SetTraceLevel:trace level set to 1.
2019-02-12 10:50:25.859 : default:2519627520: Inited LSF context: 0x7f866453c5d0
2019-02-12 10:50:25.863 : CLSCRED:2519627520: clsCredCommonInit: Inited singleton credctx.
2019-02-12 10:50:25.863 : CLSCRED:2519627520: (:CLSCRED0101:)clsCredDomInitRootDom: Using user given storage context for repository access.
2019-02-12 10:50:25.886 : USRTHRD:2519627520: {0:5:3} 8154 Error 4 querying length of attr ASM_DISCOVERY_ADDRESS
2019-02-12 10:50:25.889 : USRTHRD:2519627520: {0:5:3} 8154 Error 4 querying length of attr ASM_STATIC_DISCOVERY_ADDRESS
2019-02-12 10:50:25.924 : CLSCRED:2519627520: (:CLSCRED1079:)clsCredOcrKeyExists: Obj dom : SYSTEM.credentials.domains.root.ASM.Self.bb15e951dcbc4fc2ff3aec3bfe1f0424.root not found
2019-02-12 10:50:25.924 : USRTHRD:2519627520: {0:5:3} 7872 Error 4 opening dom root in 0x7f866441d590
2019-02-12 10:50:25.929 :GIPCXCPT:2519627520: gipcInternalSetAttribute: failed during gipcInternalSetAttribute, ret gipcretInvalidAttribute (5)
2019-02-12 10:50:25.929 :GIPCXCPT:2519627520: gipcSetAttributeNativeF [clscrsconGipcConnect : clscrscon.c : 655]: EXCEPTION[ ret gipcretInvalidAttribute (5) ] failure for obj 0x7f86647145d0 [0000000000000fc1] { gipcEndpoint : localAddr '', remoteAddr '', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj (nil), sendp (nil) status 13flags 0x20000000, flags-2 0x0, usrFlags 0x0 }, name 'traceLevel', val 0x7f86962cf004, len 4, flags 0x0
2019-02-12 10:50:26.014 : AGFW:2743070784: Recvd request to shed the threads
2019-02-12 10:50:26.014 :CLSFRAME:2743070784: TM [MultiThread] is changing desired thread # to 8. Current # is 9
2019-02-12 10:50:26.014 :CLSFRAME:2532235008: {0:1:5} Worker thread is exiting in TM [MultiThread] to meet the desired count of 8. New count is 8
2019-02-12 10:50:29.219 : USRTHRD:2519627520: {0:5:3} 7872 Error 4 opening dom root in 0x7f866465b680
2019-02-12 10:50:29.222 :GIPCXCPT:2519627520: gipcInternalSetAttribute: failed during gipcInternalSetAttribute, ret gipcretInvalidAttribute (5)
2019-02-12 10:50:29.222 :GIPCXCPT:2519627520: gipcSetAttributeNativeF [clscrsconGipcConnect : clscrscon.c : 655]: EXCEPTION[ ret gipcretInvalidAttribute (5) ] failure for obj 0x7f8664546cd0 [0000000000002068] { gipcEndpoint : localAddr '', remoteAddr '', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj (nil), sendp (nil) status 13flags 0x20000000, flags-2 0x0, usrFlags 0x0 }, name 'traceLevel', val 0x7f86962cf004, len 4, flags 0x0
2019-02-12 10:50:34.664 : USRTHRD:2519627520: {0:5:3} 7872 Error 4 opening dom root in 0x7f86645abf50
2019-02-12 10:50:34.668 :GIPCXCPT:2519627520: gipcInternalSetAttribute: failed during gipcInternalSetAttribute, ret gipcretInvalidAttribute (5)
2019-02-12 10:50:34.668 :GIPCXCPT:2519627520: gipcSetAttributeNativeF [clscrsconGipcConnect : clscrscon.c : 655]: EXCEPTION[ ret gipcretInvalidAttribute (5) ] failure for obj 0x7f8664707b20 [000000000000392d] { gipcEndpoint : localAddr '', remoteAddr '', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj (nil), sendp (nil) status 13flags 0x20000000, flags-2 0x0, usrFlags 0x0 }, name 'traceLevel', val 0x7f86962cf004, len 4, flags 0x0
2019-02-12 10:50:35.715 : USRTHRD:2509133568: HAIP: event GIPCD_METRIC_UPDATE
2019-02-12 10:50:35.715 : USRTHRD:2512111360: {0:5:3} to verify inf event
2019-02-12 10:50:35.724 : default:2519627520: clsCredDomClose: Credctx deleted 0x7f866443f840
2019-02-12 10:50:35.724 : USRTHRD:2519627520: {0:5:3} -- trace dump on error exit --
2019-02-12 10:50:35.724 : USRTHRD:2519627520: {0:5:3} Error [kgfoAl06] in [kgfokge] at kgfo.c:3115
2019-02-12 10:50:35.724 : USRTHRD:2519627520: {0:5:3} ORA-12547: TNS:lost contact
ORA-12547: TNS:lost contact
ORA-15077: could not locate ASM instance serving a required diskgroup
2019-02-12 10:50:35.724 : USRTHRD:2519627520: {0:5:3} Category: 7
2019-02-12 10:50:35.724 : USRTHRD:2519627520: {0:5:3} DepInfo: 12547
2019-02-12 10:50:35.724 : USRTHRD:2519627520: {0:5:3} -- trace dump end --
2019-02-12 10:50:35.724 :CLSDYNAM:2519627520: [ora.storage]{0:5:3} [start] StorageAgent::parsekgforetcodes retcode = 7, kgfoCheckMount(OCRDG), flag 2
2019-02-12 10:50:35.724 :CLSDYNAM:2519627520: [ora.storage]{0:5:3} [start] (null) category: 7, operation: kgfoAl06, loc: kgfokge, OS error: 12547, other: ORA-12547: TNS:lost contact
ORA-12547: TNS:lost contact
ORA-15077: could not locate ASM instance serving a required diskgroup
2019-02-12 10:50:35.724 :CLSDYNAM:2519627520: [ora.storage]{0:5:3} [start] StorageAgent::check kgfo returncode 1
2019-02-12 10:50:35.724 :CLSDYNAM:2519627520: [ora.storage]{0:5:3} [start] (:CLSN00140:)StorageAgent::parsekgforretcodes OCR dgName OCRDG state 1
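
Note:
The logs suggest that no ASM disk group was found during CRS startup. An error occurred while obtaining the ASM credentials at startup, reported as ORA-12547 and ORA-15077. In Flex ASM, the ASM server has to connect to every ASM network when it starts. The next step is to check the ASM listener on node 1.
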
grid@anbobstb01:/home/grid> crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCHDG.dg
               ONLINE  ONLINE       anbobstb01               STABLE
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       anbobstb01               STABLE
ora.DATADG.dg
               ONLINE  ONLINE       anbobstb01               STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       anbobstb01               STABLE
ora.MGMT.dg
               ONLINE  ONLINE       anbobstb01               STABLE
ora.OCRDG.dg
               ONLINE  ONLINE       anbobstb01               STABLE
ora.chad
               ONLINE  ONLINE       anbobstb01               STABLE
ora.net1.network
               ONLINE  ONLINE       anbobstb01               STABLE
ora.ons
               ONLINE  ONLINE       anbobstb01               STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       anbobstb01               STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       anbobstb01               169.254.143.82 192.1
                                                             68.43.33,STABLE
ora.asm
      1        ONLINE  ONLINE       anbobstb01               Started,STABLE
      2        ONLINE  OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       anbobstb01               STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       anbobstb01               Open,STABLE
ora.anbobstb01.vip
      1        ONLINE  ONLINE       anbobstb01               STABLE
ora.anbobstb02.vip
      1        ONLINE  INTERMEDIATE anbobstb01               FAILED OVER,STABLE
ora.qosmserver
      1        ONLINE  ONLINE       anbobstb01               STABLE
ora.rptstby.db
      1        OFFLINE OFFLINE                               Instance Shutdown,ST
                                                             ABLE
      2        ONLINE  ONLINE       anbobstb01               Open,Readonly,HOME=/
                                                             oracle/app/oracle/pr
                                                             oduct/12.2.0/db_1,ST
                                                             ABLE
ora.scan1.vip
      1        ONLINE  ONLINE       anbobstb01               STABLE
--------------------------------------------------------------------------------

grid@anbobstb01:/home/grid> crsctl stat res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       anbobstb01               Started,STABLE
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       anbobstb01               STABLE
ora.crf
      1        ONLINE  ONLINE       anbobstb01               STABLE
ora.crsd
      1        ONLINE  ONLINE       anbobstb01               STABLE
ora.cssd
      1        ONLINE  ONLINE       anbobstb01               STABLE
ora.cssdmonitor
      1        ONLINE  ONLINE       anbobstb01               STABLE
ora.ctssd
      1        ONLINE  ONLINE       anbobstb01               OBSERVER,STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.drivers.acfs
      1        ONLINE  ONLINE       anbobstb01               STABLE
ora.evmd
      1        ONLINE  ONLINE       anbobstb01               STABLE
ora.gipcd
      1        ONLINE  ONLINE       anbobstb01               STABLE
ora.gpnpd
      1        ONLINE  ONLINE       anbobstb01               STABLE
ora.mdnsd
      1        ONLINE  ONLINE       anbobstb01               STABLE
ora.storage
      1        ONLINE  ONLINE       anbobstb01               STABLE
--------------------------------------------------------------------------------

grid@anbobstb01:/home/grid> ocrdump /tmp/ocr.dmp
PROT-310: Not all keys were dumped due to permissions.
grid@anbobstb01:/home/grid> vi /tmp/ocr.dmp

[SYSTEM.ASM.CREDENTIALS.USERS.CRSUSER__ASM_001]
ORATEXT : bb15e951dcbc4fc2ff3aec3bfe1f0424:grid    -- the credential exists
SECURITY : {USER_PERMISSION : PROCR_ALL_ACCESS, GROUP_PERMISSION : PROCR_READ, OTHER_PERMISSION : PROCR_NONE, USER_NAME : grid, GROUP_NAME : oinstall}

grid@anbobstb01:~$oifcfg getif
bond0  133.96.43.0  global  public
bond1  192.168.43.0  global  cluster_interconnect,asm

grid@anbobstb01:/oracle/app/12.2.0/grid/bin> ps -ef|grep lsnr
grid      1411     1  0  2018 ?        00:01:53 /oracle/app/12.2.0/grid/bin/tnslsnr LISTENER_SCAN1 -no_crs_notify -inherit
grid      6585     1  0  2018 ?        00:00:21 /oracle/app/12.2.0/grid/bin/tnslsnr listener_dg -inherit
grid     25826 19172  0 10:57 pts/3    00:00:00 grep --color=auto lsnr
grid     72783     1  0  2018 ?        00:02:01 /oracle/app/12.2.0/grid/bin/tnslsnr MGMTLSNR -no_crs_notify -inherit
grid     78242     1  0 Feb12 ?        00:00:05 /oracle/app/12.2.0/grid/bin/tnslsnr LISTENER -no_crs_notify -inherit
grid     80521     1  0 Feb12 ?        00:00:35 /oracle/app/12.2.0/grid/bin/tnslsnr ASMNET1LSNR_ASM -no_crs_notify -inherit

grid@anbobstb01:/oracle/app/12.2.0/grid/bin> lsnrctl status ASMNET1LSNR_ASM

LSNRCTL for Linux: Version 12.2.0.1.0 - Production on 13-FEB-2019 10:57:43

Copyright (c) 1991, 2016, Oracle. All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=ASMNET1LSNR_ASM)))
STATUS of the LISTENER
------------------------
Alias                     ASMNET1LSNR_ASM
Version                   TNSLSNR for Linux: Version 12.2.0.1.0 - Production
Start Date                12-FEB-2019 18:21:02
Uptime                    0 days 16 hr. 36 min. 40 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /oracle/app/12.2.0/grid/network/admin/listener.ora
Listener Log File         /oracle/app/grid/diag/tnslsnr/anbobstb01/asmnet1lsnr_asm/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=ASMNET1LSNR_ASM)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.43.33)(PORT=1526)))
The listener supports no services
The command completed successfully

grid@anbobstb01:/oracle/app/grid/diag/tnslsnr/anbobstb01/asmnet1lsnr_asm/trace> vi asmnet1lsnr_asm.log
2019-02-13T10:57:16.203184+08:00
Incoming connection from 192.168.43.34 rejected
13-FEB-2019 10:57:16 * 12546
TNS-12546: TNS:permission denied
 TNS-12560: TNS:protocol adapter error
  TNS-00516: Permission denied

grid@anbobstb01:/oracle/app/grid/diag/tnslsnr/anbobstb01/asmnet1lsnr_asm/trace> telnet 192.168.43.33 1526
Trying 192.168.43.33...
Connected to 192.168.43.33.
Escape character is '^]'.
Connection closed by foreign host.
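
The rejection entry in the listener log is the decisive clue here, and on a busy system it is easy to miss. As a rough sketch (an illustration only, not a tool used in the original investigation), the log text can be scanned for "Incoming connection from ... rejected" lines and the source addresses counted. The function name `rejected_sources` and the sample text are assumptions; the pattern only handles IPv4 addresses in the message format shown above.

```python
import re
from collections import Counter

# Message format taken from the listener log excerpt above (IPv4 only).
REJECTED = re.compile(r"Incoming connection from (\d{1,3}(?:\.\d{1,3}){3}) rejected")

def rejected_sources(log_text):
    """Count rejected connection attempts per source address."""
    return Counter(REJECTED.findall(log_text))

# Sample lines copied from asmnet1lsnr_asm.log above.
SAMPLE_LOG = """\
2019-02-13T10:57:16.203184+08:00
Incoming connection from 192.168.43.34 rejected
13-FEB-2019 10:57:16 * 12546
TNS-12546: TNS:permission denied
"""

for address, count in rejected_sources(SAMPLE_LOG).items():
    print(address, count)
```

A source address on the ASM network, such as 192.168.43.34 in this case, points at the listener refusing a cluster peer rather than an application client.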




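Note:
Instance 1 is currently working without problems, but the ASM listener running on it reports no services. A telnet to the listener port is dropped almost immediately. iptables has no restricting rules, and tcpdump shows that the reset (RST) packet is sent by the listener process itself. If the ASM listener has no services, the Flex ASM nodes cannot communicate with each other. A likely place for a restriction on listener connections is sqlnet.ora.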
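
The telnet behaviour described in this note (the connection is accepted, then dropped at once) can be told apart from a closed port or a firewall drop with a small probe. This is a minimal sketch under stated assumptions: `probe_listener` and its return labels are hypothetical names, and the probe sends nothing, so a healthy listener that is waiting for a connect packet shows up as "open".

```python
import socket

def probe_listener(host, port, timeout=3.0):
    """Classify how a TCP endpoint reacts to a bare connection.

    Returns "refused" if nothing is listening, "unreachable" on other
    connect failures (including a connect timeout), "closed_by_peer" if
    the connection is accepted and then dropped right away, and "open"
    if it stays up for `timeout` seconds.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        return "unreachable"
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(1)
        except socket.timeout:
            # The peer kept the connection open without sending anything.
            return "open"
        except (ConnectionResetError, BrokenPipeError):
            # The peer answered with a reset.
            return "closed_by_peer"
        # An empty read means the peer closed the connection.
        return "closed_by_peer" if data == b"" else "open"
```

For example, `probe_listener("192.168.43.33", 1526)` run from node 2 would be expected to return "closed_by_peer" in the situation described here, which matches the telnet session and the reset seen in tcpdump.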

grid@anbobstb01:/oracle/app/12.2.0/grid/network/admin> vi sqlnet.ora
NAMES.DIRECTORY_PATH= (TNSNAMES,EZCONNECT)

ADR_BASE = /oracle/app/grid
TCP.VALIDNODE_CHECKING=yes
TCP.INVITED_NODES=(...)


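Note:
Sure enough, sqlnet.ora configures a whitelist. The file had been copied over from the primary database, but the private network (ASM network) of the primary and that of the standby are on different subnets. The whitelist on the standby side therefore did not include the standby's ASM network, the listener rejected connections from it, and so the listener had no services.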


Solution

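The fix is simple: add the ASM network subnet to the whitelist in sqlnet.ora. A reminder for the future: when you add a listener whitelist, remember to include not only the front-end application IPs but also the public network, the private network, the SCAN IPs, and the ASM network.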
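
Before copying sqlnet.ora between sites, the whitelist can be checked against the addresses the cluster itself needs. The sketch below is an illustration under stated assumptions: the function `uncovered_addresses` is hypothetical, the whitelist values are made up (the real list is elided in the file shown above), and it only understands literal IPv4 addresses and CIDR ranges, not hostnames or wildcard entries.

```python
import ipaddress

def uncovered_addresses(invited_nodes, required_addresses):
    """Return the required addresses that no whitelist entry covers.

    Each whitelist entry is treated as a single IP or a CIDR range;
    hostnames and wildcard forms are outside the scope of this sketch.
    """
    networks = [ipaddress.ip_network(entry, strict=False) for entry in invited_nodes]
    missing = []
    for address in required_addresses:
        ip = ipaddress.ip_address(address)
        if not any(ip in network for network in networks):
            missing.append(address)
    return missing

# Hypothetical whitelist that covers only the public subnet (assumed /24).
invited = ["133.96.43.0/24"]
# ASM network addresses of the two standby nodes, taken from the logs above.
required = ["192.168.43.33", "192.168.43.34"]

print(uncovered_addresses(invited, required))
print(uncovered_addresses(invited + ["192.168.43.0/24"], required))
```

With the hypothetical public-only whitelist both ASM addresses are reported as missing; once the ASM subnet is added the list is empty.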


Original article by 张维照 (Zhang Weizhao)